JMIR Medical Informatics — Latest Matching Preprints

1

Neural Networks Accurately Predict Precise Metrics of Hospital Resource Utilization for Total Hip Arthroplasty: A Retrospective Database Study

Abbas, A.; Lex, J. R.; Toor, J.; Khalil, E.; Ravi, B.; Whyne, C.

2025-02-13 orthopedics 10.1101/2025.02.11.25322104 medRxiv

Top 0.1%

23.5%

Background: Total hip and knee arthroplasties (THAs and TKAs) are some of the most common and successful surgeries. Predicting their duration of surgery (DOS) and length of stay (LOS) has massive implications for costs and resource management. The purpose of this study was to predict the DOS and LOS of THAs using machine learning models (MLMs) based on preoperative factors. Methods: The American College of Surgeons (ACS) National Surgical Quality Improvement Program (NSQIP) database was queried for elective unilateral THA procedures. Multiple MLMs were constructed to predict DOS and LOS. Models were evaluated according to mean squared error (MSE), buffer accuracy, and classification accuracy. To ensure useful predictions, the results of the models were compared to a mean regressor and previous MLM predictions for primary TKAs. Results: 196,942 patients were included. The neural network had the best MSE, buffer and training accuracies for both DOS and LOS. For DOS testing, the neural network MSE was 0.916, with the 30-minute buffer and ≤120 min, >120 min accuracies being 75.4% and 88.5%. For LOS testing, the neural network MSE was 0.567, with the 1-day buffer and ≤2 days, >2 days accuracies being 70.3% and 80.9%. Slightly reduced performance was found for THA compared to TKA for DOS and LOS (3 to 5%), with similar important features identified. Conclusion: MLMs based on preoperative factors successfully predicted the DOS and LOS of elective unilateral THAs, with similar performance to TKA. Future work should include operational factors to apply these models to real-world resource optimization.
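
The evaluation metrics named in this abstract can be illustrated with a minimal sketch. It assumes that "buffer accuracy" means the share of predictions that fall within a fixed tolerance (for example 30 minutes) of the observed value; the abstract does not define the term, so this reading, the function names, and the toy data are all hypothetical and are not the authors' code.

```python
from statistics import mean

def mse(actual, predicted):
    """Mean squared error between observed and predicted values."""
    return mean((a - p) ** 2 for a, p in zip(actual, predicted))

def buffer_accuracy(actual, predicted, buffer):
    """Share of predictions within +/- buffer of the observed value (assumed definition)."""
    hits = sum(abs(a - p) <= buffer for a, p in zip(actual, predicted))
    return hits / len(actual)

def threshold_accuracy(actual, predicted, threshold):
    """Share of cases where prediction and observation fall on the same side of a threshold."""
    hits = sum((a <= threshold) == (p <= threshold) for a, p in zip(actual, predicted))
    return hits / len(actual)

# Toy durations of surgery in minutes (made up for illustration only).
actual_dos = [95, 110, 130, 150, 85]
model_dos = [100, 125, 118, 190, 90]

# Mean-regressor baseline: always predict the mean. For brevity the mean is taken
# from the same toy data; a real evaluation would use the training-set mean.
baseline_dos = [mean(actual_dos)] * len(actual_dos)

print("model MSE:", mse(actual_dos, model_dos))
print("baseline MSE:", mse(actual_dos, baseline_dos))
print("30-minute buffer accuracy:", buffer_accuracy(actual_dos, model_dos, 30))
print("<=120 min vs >120 min accuracy:", threshold_accuracy(actual_dos, model_dos, 120))
```

Comparing against the mean regressor shows whether a model adds anything beyond predicting the average case, which is presumably why the abstract includes that baseline.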

2

Predicting Imminent Health Outcomes from Common Lab Results

Cerejo, J.; Neves, B.; da Silva, N. A.; Moreira, J. M.; Silva, M. J.

2023-03-05 health informatics 10.1101/2023.03.01.23286617 medRxiv

Top 0.1%

22.7%

In recent years, most hospitals have implemented Electronic Health Records to manage and integrate a wide range of medical information, including diagnostics, medication admission and laboratory test results. Certain laboratory variables may serve as indicators of a patient's clinical deterioration, making laboratory data a valuable tool for identifying high-risk patients. This work introduces a framework for predicting imminent health outcomes (IHO) of multimorbidity patients using laboratory test data. Our cohort includes 322,316 multimorbidity patients who underwent laboratory tests in a large teaching hospital between January 2007 and August 2021. Two IHO predictive tools were developed. The first considers all patients in the dataset. The second was developed using a subset of patients with Heart Failure (HF) as the main comorbidity (5% of the entire dataset), considering that HF is a highly prevalent syndrome in multimorbidity patients. This predictive model achieved a reasonable predictive performance (AUROC = 0.718, 95% CI 0.708-0.756, and AUPRC = 0.663, 95% CI 0.630-0.701). C-reactive protein and NT-proBNP are the lab tests that most positively contribute to the prediction of IHO. The IHO predictive tool has the potential to help the medical team identify patients at high risk of an imminent adverse event, highlighting the laboratory variables that contribute most to the deterioration of the patient.

3

Assessing the performance of GPT-4 in the field of osteoarthritis and orthopaedic case consultation

Li, J.; Gao, X.; Dou, T.; Gao, Y.; Zhu, W.

2023-08-09 orthopedics 10.1101/2023.08.06.23293735 medRxiv

Top 0.1%

22.4%

Background: Large Language Models (LLMs) like GPT-4 demonstrate potential applications in diverse areas, including healthcare and patient education. This study evaluates GPT-4's competency against osteoarthritis (OA) treatment guidelines from the United States and China and assesses its ability in diagnosing and treating orthopedic diseases. Methods: Data sources included OA management guidelines and orthopedic examination case questions. Queries were directed to GPT-4 based on these resources, and its responses were compared with the established guidelines and cases. The accuracy and completeness of GPT-4's responses were evaluated using Likert scales, while case inquiries were stratified into four tiers of correctness and completeness. Results: GPT-4 exhibited strong performance in providing accurate and complete responses to OA management recommendations from both the American and Chinese guidelines, with high Likert scale scores for accuracy and completeness. It demonstrated proficiency in handling clinical cases, making accurate diagnoses, suggesting appropriate tests, and proposing treatment plans. Few errors were noted in specific complex cases. Conclusions: GPT-4 exhibits potential as an auxiliary tool in orthopedic clinical practice and patient education, demonstrating high accuracy and completeness in interpreting OA treatment guidelines and analyzing clinical cases. Further validation of its capabilities in real-world clinical scenarios is needed.

4

Automated Derivation of Diagnostic Criteria for Lung Cancer using Natural Language Processing on Electronic Health Records: A pilot study.

Houston, A.; Williams, S.; Ricketts, W.; Gutteridge, C.; Tackaberry, C.; Conibear, J.

2024-02-21 health informatics 10.1101/2024.02.20.24303084 medRxiv

Top 0.1%

22.0%

Background: The digitisation of healthcare records has generated vast amounts of unstructured data, presenting opportunities for improvements in disease diagnosis when clinical coding falls short, such as in the recording of patient symptoms. This study presents an approach using natural language processing to extract clinical concepts from free text, which are used to automatically form diagnostic criteria for lung cancer from unstructured secondary-care data. Methods: Patients aged 40 and above who underwent a chest x-ray (CXR) between 2016 and 2022 were included. ICD-10 and unstructured data were pulled from their electronic health records (EHRs) for the 12 months preceding the CXR. The unstructured data were processed using named entity recognition to extract symptoms, which were mapped to SNOMED-CT codes. Subsumption of features up the SNOMED-CT hierarchy was used to mitigate sparse features, and a frequency-based criterion, combined with univariate logarithmic probabilities, was applied to select candidate features to take forward to the model development phase. A genetic algorithm was employed to identify the most discriminating features to form the diagnostic criteria. Results: 75,002 patients were included, with 1,012 lung cancer diagnoses made within 12 months of the CXR. The best-performing model achieved an AUROC of 0.72. Results showed that an existing disorder of the lung, such as pneumonia, and a cough increased the probability of a lung cancer diagnosis. Anomalies of great vessel, disorder of the retroperitoneal compartment and context-dependent findings, such as pain, statistically reduced the risk of lung cancer, making other diagnoses more likely. The performance of the developed model was compared to the existing cancer risk scores, demonstrating superior performance. Conclusions: The proposed methods demonstrated success in leveraging unstructured secondary-care data to derive diagnostic criteria for lung cancer, outperforming existing risk tools.
These advancements show potential for enhancing patient care and results. However, it is essential to tackle specific limitations by integrating primary care data to ensure a more thorough and unbiased development of diagnostic criteria. Moreover, the study highlights the importance of contextualising SNOMED-CT concepts into meaningful terminology that resonates with clinicians, facilitating a clearer and more tangible understanding of the criteria applied.

5

A novel specific artificial intelligence-based method to identify COVID-19 cases using simple blood exams

Soares, F.

2020-07-03 health informatics 10.1101/2020.04.10.20061036 medRxiv

Top 0.1%

21.9%

Background: The SARS-CoV-2 virus responsible for COVID-19 poses a significant challenge to healthcare systems worldwide. Despite governmental initiatives aimed at containing the spread of the disease, several countries are experiencing unmanageable increases in the demand for ICU beds, medical equipment, and larger testing capacity. Efficient COVID-19 diagnosis enables healthcare systems to provide better care for patients while protecting caregivers from the disease. However, many countries are constrained by the limited number of test kits available and by a lack of equipment and trained professionals. In the case of patients visiting emergency rooms (ERs) with suspected COVID-19, prompt diagnosis may improve the outcome and even provide information for efficient hospital management. In such a context, a quick, inexpensive and readily available test to perform an initial triage in ERs could help to smooth patient flow, provide better patient care, and reduce the backlog of exams. Methods: In this case-control quantitative study, we developed a strategy backed by artificial intelligence to perform an initial screening of suspected COVID-19 patients. We developed a machine learning classifier that takes widely available simple blood exams as input and classifies samples as likely to be positive (having SARS-CoV-2) or negative (not having SARS-CoV-2). Based on this initial classification, positive cases can be referred for further highly sensitive testing (e.g. CT scan, or specific antibodies). We used publicly available data from the Albert Einstein Hospital in Brazil from 5,644 patients. Focusing on simple blood exam figures as main predictors, a sample of 599 subjects that had the fewest missing values for 16 common exams was selected. Of these 599 patients, 81 tested positive for SARS-CoV-2 (determined by RT-PCR).
Based on the reduced dataset, we built an artificial intelligence classification framework, ER-CoV, aimed at determining whether suspected patients arriving in the ER were likely to be negative for SARS-CoV-2. The primary goal of this investigation is to develop a classifier with high specificity and high negative predictive value, with reasonable sensitivity. Findings: Our AI framework achieved an average specificity of 85.98% [95% CI: 84.94%-86.84%] and negative predictive value (NPV) of 94.92% [95% CI: 94.37%-95.37%]. Those values are completely aligned with our goal of providing an effective low-cost system to triage suspected patients in ERs. As for sensitivity, our model achieved an average of 70.25% [95% CI: 66.57%-73.12%] and positive predictive value (PPV) of 44.96% [95% CI: 43.15%-46.87%]. The area under the curve (AUC) of the receiver operating characteristic (ROC) was 86.78% [95% CI: 85.65%-87.90%]. An error analysis (inspection of which patients were misclassified) identified that, on average, 28% of the false negative results would have been hospitalized anyway; thus the model is making mistakes for severe cases that would not be overlooked, partially mitigating the fact that the test is not highly sensitive. All code for our AI model, called ER-CoV, is publicly available at https://github.com/soares-f/ER-CoV. Interpretation: Based on the capacity of our model to accurately predict which cases are negative among suspected patients arriving in emergency rooms, we envision that this framework may play an important role in patient triage. Probably the most important outcome is related to testing availability, which at this point is extremely low in many countries. Considering the achieved specificity, we could reduce by at least 90% the number of SARS-CoV-2 tests performed in emergency rooms, with around 5% chance of getting a false negative.
The second important outcome is related to patient management in hospitals. Patients predicted as positive by our framework could be immediately separated from other patients while waiting for the results of confirmatory tests. This could reduce the spread rate within hospitals, since in many of them all suspected cases are kept in the same ward. In Brazil, where the data was collected, the infection is starting to spread quickly and the lead time of a SARS-CoV-2 test may be up to 2 weeks.
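
The relationship between the four reported rates can be checked with a short sketch. It applies Bayes' rule to the reported average sensitivity and specificity and to the prevalence in the reduced dataset (81 of 599). The function name is hypothetical, and this is not taken from the ER-CoV repository.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) implied by sensitivity, specificity and prevalence via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Prevalence in the reduced dataset: 81 RT-PCR positives out of 599 subjects.
prevalence = 81 / 599

# Average sensitivity and specificity as reported in the abstract.
ppv, npv = predictive_values(sensitivity=0.7025, specificity=0.8598, prevalence=prevalence)
print(f"implied PPV: {ppv:.3f}, implied NPV: {npv:.3f}")
```

The implied values land close to the reported PPV and NPV but need not match them exactly, since the abstract reports averages over repeated evaluations. The sketch also shows why the NPV is high while the PPV is modest: with roughly 13.5% prevalence, most negative predictions are correct even at moderate sensitivity.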

6

Multicenter analysis of COVID-19 hospitalizations and stacking machine learning algorithms for prediction of high-risk patients

Shaw, R.; Bassily, D.; Patel, L.; O'Connor, T.; Rafidi, R.; Formanek, P.

2023-06-22 health informatics 10.1101/2023.06.20.23291685 medRxiv

Top 0.1%

21.7%

Objective: To create and validate an ensemble of machine learning algorithms to accurately predict ICU admission or mortality upon initial presentation to the emergency department. Methods: This is a retrospective cohort study of a multicenter hospital system in the United States. The electronic health record was queried from March 2020 to December 2021 for patients who presented to the emergency department and subsequently tested COVID-positive. Associated patient demographics, vitals, and laboratory vitals were obtained. High-risk individuals were defined as those who required ICU admission or died; low-risk individuals did not meet those criteria. The dataset was split 3:1 into training and testing datasets. A machine learning ensemble stack was built to predict ICU admission and mortality. Results: Of the 3,142 hospital admissions with a COVID-positive test, there were 1,128 (36%) individuals labeled as high-risk, and 2,014 (64%) as low-risk. We obtained 147 unique variables. CRP, LDH, procalcitonin, glucose, anion gap, creatinine, age, oxygen saturation, oxygen device, and obtainment of an ABG were chosen. Six machine learning models were then trained over model-specific hyperparameters, and then assessed on the testing dataset, generating an area under the receiver operating characteristic curve of 0.751, with a specificity of 95% in predicting high-risk individuals based on an initial emergency department assessment. Conclusion: A novel machine learning model was generated to predict ICU admission and patient mortality from a multicenter hospital system and validated on unseen data.

7

Generating Evidence for Chronic Obstructive Pulmonary Disease (COPD) Clinical Guidelines Using EHR Data

Johnson, A. M.; Adibuzzaman, M.; Griffin, P.; Bikak, M.

2019-09-20 respiratory medicine 10.1101/19006023 medRxiv

Top 0.1%

19.0%

Objectives: The aim of this research was to develop data-driven models using electronic health records (EHRs) to conduct clinical studies for predicting clinical outcomes through probabilistic analysis that considers temporal aspects of clinical data. We assess the efficacy of antibiotic treatment and the optimal time of initiation for hospitalized patients diagnosed with acute exacerbation of COPD (AECOPD) as an application of probabilistic modeling. Materials and Methods: We developed a semi-automatic Markov Chain Monte Carlo (MCMC) modeling and simulation approach that encodes clinical conditions as computable definitions of health states and exact time duration as input for parameter estimations using raw EHR data. We applied the MCMC approach to the MIMIC-III clinical database, where ICD-9 diagnosis codes (491.21, 491.22, and 494.1) were used to identify data for 697 AECOPD patients, of whom 25.9% were administered antibiotics. Results: The average time to antibiotic administration was 27 hours, and 32% of patients were administered vancomycin as the initial antibiotic. The model simulations showed a 50% decrease in mortality rate as the number of patients administered antibiotics increased. There was an estimated 5.5% mortality rate when antibiotics were initially administered after 48 hours vs 1.8% when antibiotics were initially administered between 24 and 48 hours. Discussion: Our findings suggest that there may be a mortality benefit in initiating antibiotics early in patients with severe respiratory failure in settings of COPD exacerbations warranting an ICU admission. Conclusion: Probabilistic modeling and simulation methods that consider temporal aspects of raw clinical patient data can be used to adequately generate evidence for clinical guidelines.

8

Artificial Intelligence in Diabetes Care: Evaluating GPT-4's Competency in Reviewing Diabetic Patient Management Plan in Comparison to Expert Review

Mondal, A.; Naskar, A.

2024-04-14 endocrinology 10.1101/2024.04.12.24305732 medRxiv

Top 0.1%

18.5%

Background: The escalating global burden of diabetes necessitates innovative management strategies. Artificial intelligence, particularly large language models like GPT-4, presents a promising avenue for improving guideline adherence in diabetes care. Such technologies could revolutionize patient management by offering personalized, evidence-based treatment recommendations. Methods: A comparative, blinded design was employed, involving 50 hypothetical diabetes mellitus case summaries, emphasizing varied aspects of diabetes management. GPT-4 evaluated each summary for guideline adherence, classifying them as compliant or non-compliant, based on the ADA guidelines. A medical expert, blinded to GPT-4's assessments, independently reviewed the summaries. Concordance between GPT-4's and the expert's evaluations was statistically analyzed, including calculating Cohen's kappa for agreement. Results: GPT-4 labelled 30 summaries as compliant and 20 as non-compliant, while the expert identified 28 as compliant and 22 as non-compliant. Agreement was reached on 46 of the 50 cases, yielding a Cohen's kappa of 0.84, indicating near-perfect agreement. GPT-4 demonstrated a 92% accuracy, with a sensitivity of 86.4% and a specificity of 96.4%. Discrepancies in four cases highlighted challenges in AI's understanding of complex clinical judgments related to medication adjustments and treatment modifications. Conclusion: GPT-4 exhibits promising potential to support health-care professionals in reviewing diabetes management plans for guideline adherence. Despite high concordance with expert assessments, instances of non-agreement underscore the need for AI refinement in complex clinical scenarios. Future research should aim at enhancing AI's clinical reasoning capabilities and exploring its integration with other technologies for improved healthcare delivery.
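
The agreement statistics in this abstract can be reproduced from the counts it reports. The sketch below recovers the 2x2 table from the marginal totals and the number of agreements, then computes accuracy, sensitivity, specificity and Cohen's kappa. Treating "non-compliant" as the positive class is an assumption, chosen because it reproduces the reported sensitivity and specificity; the function names are hypothetical.

```python
def table_from_marginals(n, model_pos, expert_pos, agreements):
    """Recover (tp, fp, fn, tn) from the totals and the number of agreements.

    Uses tp + tn = agreements and tp - tn = model_pos + expert_pos - n.
    """
    tp = (agreements + model_pos + expert_pos - n) // 2
    tn = agreements - tp
    fp = model_pos - tp   # model says positive, expert says negative
    fn = expert_pos - tp  # model says negative, expert says positive
    return tp, fp, fn, tn

def cohens_kappa(tp, fp, fn, tn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1 - expected)

# Counts from the abstract, with "non-compliant" as the positive class (assumption):
# GPT-4 flagged 20 summaries, the expert flagged 22, and they agreed on 46 of 50.
tp, fp, fn, tn = table_from_marginals(n=50, model_pos=20, expert_pos=22, agreements=46)

accuracy = (tp + tn) / 50
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
kappa = cohens_kappa(tp, fp, fn, tn)
print(tp, fp, fn, tn, accuracy, round(sensitivity, 3), round(specificity, 3), round(kappa, 2))
```

Kappa is lower than the raw 92% agreement because two raters who each label roughly 60% of cases as compliant would agree on about half of the cases by chance alone.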

9

Validation of 13,102 ICD-10-CM Codes Using a Large Language Model-Based System

Wang, Y.; Song, Y.; Siu, R.; Nimma, I. R.; Yan, Y.; Savage, T. R.; Wang, Y.; Li, Z.; Ramai, D.; Wang, J.; Badurdeen, D.; Tao, C.; Kumbhari, V.; Huang, Y.

2025-12-31 health informatics 10.64898/2025.12.30.25343244 medRxiv

Top 0.1%

18.5%

Objective: To comprehensively evaluate the validity of ICD-10-CM codes for both prevalent diagnoses and less common diseases, and to assess the performance of a large language model (LLM)-based system in validating these codes. Materials and Methods: This retrospective study analyzed hospital admissions from the Medical Information Mart for Intensive Care (MIMIC-IV) database. We developed a validated LLM-based system using GPT-4o, refined through iterative prompt engineering, to assess ICD-10-CM code validity. We measured the PPV of ICD-10-CM codes, PPV of principal and secondary diagnoses, and the performance of an LLM-based system in code validation. Results: Among 865,079 assigned codes, the PPV was 84.6% (95% CI, 84.5%-84.6%). Principal diagnoses had a PPV of 93.9% (95% CI, 93.7%-94.1%), while secondary diagnoses had a PPV of 83.8% (95% CI, 83.7%-83.9%). The LLM system demonstrated high performance in validating ICD codes, achieving 93.6% accuracy, 95.4% sensitivity and 85.2% specificity. Among correctly assigned secondary diagnoses, the majority (67.9%) represented historical or baseline conditions, while 32.1% reflected active conditions that deviated from baseline status; 22.3% of these emerged after hospital admission. PPV decreased with later diagnosis positions, with the largest decline occurring between principal and secondary diagnoses. Discussion and Conclusion: In this large-scale evaluation, ICD-10-CM codes exhibited generally high accuracy, though variability existed by position and condition type. A validated LLM system performed comparably to physician review and offers a scalable means to improve coding accuracy. These findings support the potential for integrating LLM-based auditing into routine workflows to strengthen the quality of administrative and research data.

10

Developing and optimizing machine learning algorithms for predicting in-hospital patient charges for Congestive Heart Failure Exacerbations, Chronic Obstructive Pulmonary Disease Exacerbations and Diabetic Ketoacidosis

Arnold, M. C.; Boland, M. R.; Liou, L.

2023-12-18 health informatics 10.1101/2023.12.17.23298944 medRxiv

Top 0.1%

18.3%

Background: Hospitalizations for exacerbations of congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD) and diabetic ketoacidosis (DKA) are costly in the United States. Objective: The purpose of this study is to predict in-hospital charges for each condition using Machine Learning (ML) models. Methods: We conducted a retrospective cohort study on national discharge records of hospitalized adult patients from January 1st, 2016, to December 31st, 2019. We used numerous ML techniques to predict in-hospital total cost. Results: We found that linear regression (LM), gradient boosting (GBM) and extreme gradient boosting (XGB) models had good predictive performance and were statistically equivalent, with training R-squared values ranging from 0.49-0.95 for CHF; 0.56-0.95 for COPD; and 0.32-0.99 for DKA. We identified key features driving costs, including patient age, length of stay, number of procedures, and elective/non-elective admission. Conclusions: ML methods may be used to accurately predict costs and identify drivers of high cost for COPD exacerbations, CHF exacerbations and DKA. Overall, our findings may inform future studies that seek to decrease the underlying high patient costs for these conditions.

11

Building Prediction Models for 30-Day Readmissions Among ICU Patients Using Both Structured and Unstructured Data in Electronic Health Records

Moerschbacher, A.; He, Z.

2021-08-11 health informatics 10.1101/2021.08.10.21261858 medRxiv

Top 0.1%

18.3%

ICU readmissions are associated with poor outcomes for patients and poor performance of hospitals. Patients who are readmitted have an increased risk of in-hospital death; hospitals with a higher readmission rate have reduced profitability, due to an increase in cost and reduced payments from Medicare and Medicaid programs. Predicting a patient's likelihood of being readmitted to the ICU can help reduce early discharges and the risk of in-hospital deaths, and help increase profitability. In this study, we built and evaluated multiple machine learning models to predict 30-day readmission of ICU patients in the MIMIC-III database. We used both structured data, including demographics, laboratory tests, and comorbidities, and unstructured discharge summaries as predictors, and evaluated different combinations of features. The best-performing model in this study, logistic regression, achieved an AUROC of 75.7%. This study shows the potential of leveraging machine learning and deep learning for predicting ICU readmissions.

12

Integrating Real-Time Location Systems with Electronic Medical Records: A Machine Learning Approach for In-Hospital Fall Risk Prediction

Kim, D. W.; Seo, J.; Kwon, S.; Park, C. M.; Han, C.; Kim, Y.; Yoon, D.; Kim, K. M.

2024-03-15 health informatics 10.1101/2024.03.11.24304095 medRxiv

Top 0.1%

18.2%

Hospital falls are the most prevalent adverse event in healthcare, posing significant risks to patient health outcomes and institutional care quality. The effectiveness of several fall prediction models currently in use is limited by various clinical factors. This study explored the efficacy of merging real-time location system (RTLS) data with clinical information to enhance the accuracy of in-hospital fall predictions. Model performance was compared across the clinical data, the RTLS data, and a hybrid approach using various evaluation metrics. The RTLS and integrated clinical data were obtained from 22,201 patients between March 2020 and June 2022. From the initial cohort, 118 patients with falls and 443 patients without falls were included. Predictive models were developed using the XGBoost algorithm across three distinct frameworks: clinical model, RTLS model, and clinical + RTLS model. Model performance was evaluated using metrics such as AUROC, AUPRC, accuracy, PPV, sensitivity, specificity, and F1 score. Shapley additive explanation values were used to enhance model interpretability. The clinical model yielded an AUROC of 0.813 and AUPRC of 0.407. The RTLS model demonstrated superior fall prediction capabilities, with an AUROC of 0.842 and AUPRC of 0.480. The clinical + RTLS model excelled further, achieving an AUROC of 0.853 and AUPRC of 0.497. Feature importance analysis revealed that movement patterns of patients on the last day of their stay were significantly associated with falls, together with elevated RDW levels, sedative administration, and age. This study underscored the advantages of combining RTLS data with clinical information to predict in-hospital falls more accurately. This innovative technology-driven approach may enhance early fall risk detection during hospitalization, potentially preventing falls, improving patient safety, and contributing to more efficient healthcare delivery.

13

Inhospital Mortality, Readmission, and Prolonged Length of Stay Risk Prediction Leveraging Historical Electronic Health Records

Bopche, R.; Tuset, L. G.; Afset, J. E.; Ehrnström, B.; Damas, J. K.; Nytro, O.

2024-04-16 health informatics 10.1101/2024.04.15.24305875 medRxiv

Top 0.1%

17.8%

Objective: The aim of this study was to investigate the capability of historical patient records maintained at hospitals for predicting impending adverse outcomes such as mortality, readmission, and prolonged length of stay (PLOS). Methods: Leveraging a de-identified dataset from a tertiary care university hospital, we developed an eXplainable Artificial Intelligence (XAI) framework combining tree-based and traditional ML models with interpretations, and statistical analysis of predictors of mortality, readmission, and PLOS. Results: Our framework demonstrated exceptional predictive performance, with an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.9625 and Area Under the Precision-Recall Curve (AUPRC) of 0.8575 for 30-day mortality at discharge, and an AUROC of 0.9545 and AUPRC of 0.8419 at admission. For the readmission and PLOS risk, the highest AUROCs achieved were 0.8198 and 0.9797, respectively. The tree-based machine learning (ML) models consistently outperformed the traditional ML models in all four prediction tasks. The key predictors were age, derived temporal features, routine laboratory tests, and diagnostic and procedural codes. Conclusion: The study underscores the potential of leveraging medical history for enhanced predictive analytics in hospitals. We present an accurate and intuitive framework for early warning models that can be easily implemented in current and developing digital health platforms to accurately predict adverse outcomes.

14

Improving the Prediction of Unplanned 30-day Cancer Readmissions Using Social Determinants of Health: A Geocoding-based Approach

Bindhu, S.; Wu, T.-C.; Shih, H.; Chintalapalli, H.; Liu, H.; Wells, A.; Morrison, C. F.; Hsu, W.-W.; Wu, D. T.

2025-09-02 health informatics 10.1101/2025.08.31.25334806 medRxiv

Top 0.1%

17.6%

Unplanned cancer readmissions present a significant burden on patients and hospitals. Current predictive models often overlook socioeconomic factors such as social determinants of health (SDoH), which have the potential to improve prediction performance, as measured by the Area Under the Receiver Operating Characteristic curve (AUROC) and the Area Under the Precision-Recall Curve (AUPRC). To investigate this, the present study developed predictive models using cancer readmission data from a large health system in Hamilton County, OH. The models incorporated geocoding-based SDoH along with clustering techniques and compared machine learning (ML) and deep learning (DL) algorithms. Overall, models that did not use SDoH variables had higher AUROCs and AUPRCs, regardless of algorithm type. The best-performing ML and DL models were comparable (AUROC = 0.7605 for ML; AUROC = 0.7585 for DL). However, when top-performing models were evaluated across certain organ and system cancers, using SDoH and clustering techniques significantly improved model performance. This was most notable for cancers of the skin, subcutaneous tissue, and breast, with improvements of 8.20% in AUROC and 11.04% in AUPRC. For all cancer patient cases, utilizing individualized SDoH information extracted from clinical notes was recommended for future studies.

15

Identifying Sequential Complication and Mortality Patterns in Diabetes Mellitus: Comparisons of Machine Learning Methodologies

Zhou, J.; Lee, S.; Wong, W. T.; Liu, T.; Roever, L.; Jeevaratnam, K.; Wu, W. K.; Wong, I. C.; Tse, G.; Zhang, Q.

2020-12-22 endocrinology 10.1101/2020.12.21.20248646 medRxiv

Top 0.1%

17.6%

Background: Diabetes mellitus-related complications adversely affect the quality of life. Better risk-stratified care through mining of sequential complication patterns is needed to enable early detection and prevention. Methods: Univariable and multivariate logistic regression was used to identify significant variables that can predict mortality. A sequence analysis method termed Prefixspan was applied to identify the most common couple, triple, quadruple, quintuple and sextuple sequential complication patterns in the directed comorbidity pathology network. A knowledge-enhanced CPT+ (KCPT+) sequence prediction model was developed to predict the next possible outcome along the progression trajectories of diabetes-related complications. Findings: A total of 14,144 diabetic patients (51% males) were included. Acute myocardial infarction (AMI) without known ischaemic heart disease (IHD) (odds ratio [OR]: 2.8, 95% CI: [2.3, 3.4]), peripheral vascular disease (OR: 2.3, 95% CI: [1.9, 2.8]), dementia (OR: 2.1, 95% CI: [1.8, 2.4]), and IHD with AMI (OR: 2.4, 95% CI: [2.1, 2.6]) are the most important multivariate predictors of mortality. KCPT+ shows high accuracy in predicting mortality (F1 score 0.90, AUC 0.88), osteoporosis (F1 score 0.86, AUC 0.82), ophthalmological complications (F1 score 0.82, AUC 0.82), IHD with AMI (F1 score 0.81, AUC 0.85) and neurological complications (F1 score 0.81, AUC 0.83) with a particular prior complication sequence. Interpretation: Sequence analysis identifies the most common pattern characteristics of disease-related complications efficiently. The proposed sequence prediction model is accurate and enables clinicians to diagnose the next complication earlier, provide better risk-stratified care, and devise efficient treatment strategies for diabetes mellitus patients.

16

Deep learning-Based Correlation Analysis of Pelvic and Spinal Sequences for Enhanced Sagittal Spinal Alignment Prediction

Song, K.; Qi, H.; Ma, C.; Chi, F. P.; Lin, Y. J.; Yang, Q.; Yang, C.; Wang, B.; Li, C. F.; Zhu, Z. Z.; Li, S. W.; Zhang, G. J.; Lu, W.; Wang, Z.

2023-09-17 orthopedics 10.1101/2023.09.17.23295663 medRxiv

Top 0.1%

17.6%

Background: Pelvic Incidence (PI) plays a crucial role in surgical planning. However, it is insufficient for accurately predicting spinal alignment parameters, including Sacral Slope, Pelvic Tilt, and Lumbar Lordosis. We have devised an AI-based method for predicting sagittal spinal alignments with enhanced precision. Methods: We have developed an AI-based system utilizing a Seq2Seq framework to model the spatial correlation between pelvic and spinal key points. This system was trained on a dataset of 337 cases and evaluated using 51 cases obtained from a multi-centre hospital. To address the issue of pelvic rotation, we introduced an Angle Correlation Network. We compared the performance of our AI-based system in predicting spinal alignment against the traditional PI-based method. This comparison was conducted using Mean Absolute Error (MAE) and the Correlation Coefficient (R value) as evaluation metrics. Results: We evaluated the performance of our AI-based system for predicting Sacral Slope (SS), Pelvic Tilt (PT), and Lumbar Lordosis (LL) values. The Pearson correlation coefficient of the AI-based method surpassed that of the PI-based method (0.80 vs 0.67 for SS, 0.73 vs 0.52 for PT, and 0.76 vs 0.48 for LL), indicating a more robust linear relationship between AI predictions and actual values. Additionally, the AI-based method exhibited a lower Mean Absolute Error (MAE) compared to the PI-based method for LL (5.52 vs 6.69), signifying enhanced prediction accuracy. Conclusions: In this study, we demonstrated the potential of an AI-based approach for predicting sagittal spinal alignments with improved precision compared to the traditional PI-based method. The AI-based system, utilizing a Seq2Seq framework and an Angle Correlation Network, exhibited a stronger linear relationship between predicted and actual values for Sacral Slope, Pelvic Tilt, and Lumbar Lordosis, as well as a reduced Mean Absolute Error for Lumbar Lordosis. These findings support the integration of AI in spinal surgery planning and personalized medicine for sagittal alignment evaluation and management.

17

Detecting Fifth Metatarsal Fractures on Radiographs through the Lens of Smartphones: A FIXUS AI Algorithm

Taseh, A.; Shah, A.; Eftekhari, M.; Flaherty, A.; Ebrahimi, A.; Jones, S.; Nukala, V.; Nazarian, A.; Waryasz, G.; Ashkani-Esfahani, S.

2025-07-18 orthopedics 10.1101/2025.07.18.25331772 medRxiv

Top 0.1%

17.4%

Background: Fifth metatarsal (5MT) fractures are common but challenging to diagnose, particularly with limited expertise or subtle fractures. Deep learning shows promise but faces limitations due to image quality requirements. This study develops a deep learning model to detect 5MT fractures from smartphone-captured radiograph images, enhancing accessibility of diagnostic tools. Methods: A retrospective study included patients aged >18 with 5MT fractures (n=1240) and controls (n=1224). Radiographs (AP, oblique, lateral) from Electronic Health Records (EHR) were obtained and photographed using a smartphone, creating a new dataset (SP). Models using ResNet 152V2 were trained on EHR, SP, and combined datasets, then evaluated on a separate smartphone test dataset (SP-test). Results: On validation, the SP model achieved optimal performance (AUROC: 0.99). On the SP-test dataset, the EHR model's performance decreased (AUROC: 0.83), whereas the SP and combined models maintained high performance (AUROC: 0.99). Conclusions: Smartphone-specific deep learning models effectively detect 5MT fractures, suggesting their practical utility in resource-limited settings.

18

Using the OHDSI network to develop and externally validate a patient-level prediction model for Heart Failure in Type II Diabetes Mellitus.

Williams, R. D.; Reps, J. M.; Kors, J. A.; Ryan, P. B.; Steyerberg, E.; Verhamme, K.; Rijnbeek, P. R.

2021-04-07 endocrinology 10.1101/2021.04.06.21254966 medRxiv

Top 0.1%

14.9%

Introduction: Heart Failure (HF) and Type 2 Diabetes Mellitus (T2DM) frequently coexist and exacerbate symptoms of each other. Treatments are available for T2DM that also provide beneficial treatment effects for HF. Guidelines recommend that patients with HF should be given Sodium-glucose co-transporter-2 inhibitors in preference to other second-line treatments for T2DM. Increasing personalization of treatment means that patients who have or are at risk of HF receive a customised treatment. We aimed to develop and externally validate prediction models to predict the 1-year risk of incident HF in T2DM patients starting second-line treatment. Methods: We analysed a federated network of electronic medical records and administrative claims data from five databases (CCAE, MDCD, MDCR, Optum Clinformatics and Optum EHR) in the United States. We used each database to develop a model to predict 1-year risk of incident HF in patients initiating a second pharmaceutical intervention, following initial treatment with metformin for T2DM. We then performed internal validation for each model as well as external validation using the other databases. Results: A total of 403,187 patients were included in the study. We developed 5 models with discrimination ranging from 0.72 to 0.80 at external validation in the other databases. Consistent high performance was noted for models developed in CCAE, Optum Clinformatics and Optum EHR with AUCs ranging from 0.74 to 0.81. For these models, calibration was acceptable. Conclusion: Three high-performing prediction models were developed for this problem. The CCAE-developed model was selected as the best model and is recommended, as it achieved the same discrimination as, and better calibration than, the Optum Clinformatics and Optum EHR models whilst selecting fewer covariates. The models could be useful in stratifying patient treatment, planning healthcare utilization and reducing cost by increasing the preparedness of healthcare providers.

19

Artificial Intelligence-Driven Innovations in Diabetes Care and Monitoring

Abdul Rahman, S.; Mahadi, M.; Yuliana, D.; Budi Susilo, Y. K.; Ariffin, A. E.; Amgain, K.

2025-06-02 endocrinology 10.1101/2025.06.02.25328795 medRxiv

Top 0.1%

14.5%

This study explores Artificial Intelligence (AI)'s transformative role in diabetes care and monitoring, focusing on innovations that optimize patient outcomes. AI, particularly machine learning and deep learning, significantly enhances early detection of complications like diabetic retinopathy and improves screening efficacy. The methodology employs a bibliometric analysis using Scopus, VOSviewer, and Publish or Perish, analyzing 235 articles from 2023-2025. Results indicate a strong interdisciplinary focus, with Computer Science and Medicine being dominant subject areas (36.9% and 12.9% respectively). Bibliographic coupling reveals robust international collaborations led by the U.S. (1558.52 link strength), UK, and China, with key influential documents by Zhu (2023c) and Annuzzi (2023). This research highlights AI's impact on enhancing monitoring, personalized treatment, and proactive care, while acknowledging challenges in data privacy and ethical deployment. Future work should bridge technological advancements with real-world implementation to create equitable and efficient diabetes care systems.

20

Enhanced Diabetes Prediction Using Novel Additive-Multiplicative Neural Networks: A Comprehensive Machine Learning Analysis of the PIMA Indians Dataset

Demirel, S.; Aytekin, K.; Agraz, M.

2025-09-22 endocrinology 10.1101/2025.09.20.25336250 medRxiv

Top 0.1%

14.4%

Background: Early diabetes detection remains challenging, requiring robust machine learning approaches that balance accuracy with clinical interpretability for effective diagnostic support. Methods: We propose a novel Additive and Multiplicative Neurons Network (AMNN) that combines both additive and multiplicative computational pathways to capture complex nonlinear relationships in diabetes prediction. Using the PIMA Indians Diabetes dataset (n=768), we compared AMNN against nine established algorithms including XGBoost, KAN, and traditional neural networks. Data preprocessing included SMOTE oversampling for class imbalance, and model interpretability was enhanced through SHAP and LIME explainable AI techniques. Results: The AMNN model outperformed all baseline approaches, achieving 75.76% accuracy, a 76.18% F1-score, and an AUC-ROC of 0.8206. Across both traditional feature selection techniques and explainable AI analyses, glucose levels, BMI, age, and pregnancy count consistently emerged as the most influential predictors. Conclusions: The AMNN framework demonstrates strong potential for diabetes prediction by balancing accuracy with clinical interpretability. The key predictors it highlights align closely with established medical knowledge, reinforcing confidence in its outputs and suitability for use in clinical decision-making workflows. This hybrid neural network approach represents a promising step toward transparent, AI-assisted diagnostic tools that can support healthcare professionals in practice.